CRYPTOSOLVE VERSION 1.0
Copyright 1996 James A. Parsly

CRYPTOSOLVE is a WINDOWS-based cryptogram solving utility. At its most
basic level it allows you to type in a cryptogram and handles the chore
of letter substitution for you. However, the real point of the program
is the underlying expert system that provides hints and can even make a
stab at automatic solution.

--------------------------------------------------------------------
SYSTEM REQUIREMENTS

This program requires Microsoft Windows. If you plan to use the Hints
at all, I recommend at least a 486-66. A Pentium is even better.

--------------------------------------------------------------------
INSTALLATION:

If you downloaded CRYPTOSOLVE, it probably came in the form of a .ZIP
file, CSOLVE.ZIP. Use PKUNZIP to Un-ZIP the files onto a floppy disk.
Get into WINDOWS and from the Program Manager menu select FILE and then
RUN. In the Command Line box enter A:\SETUP or B:\SETUP and then click
OK. This should install CRYPTOSOLVE and create a group containing the
CRYPTOSOLVE icon.

FILE LIST:

A complete installation of CRYPTOSOLVE includes the following files:

  CSOLVE.EXE
  CSOLVE.INI
  VBRUN300.DLL   Required for Visual Basic. Goes in WINDOWS/SYSTEM
  THREED.VBX     Required for Visual Basic. Goes in WINDOWS/SYSTEM
  PATT.DAT       Pattern database
  PATT2.DAT      Rule database
  WORDLIST.*     Word counts for various word lengths
  README.TXT     This file
  REGISTER.TXT   Registration form

This is the complete version of CRYPTOSOLVE and is not crippled in any
way.

--------------------------------------------------------------------
REGISTRATION

It costs $10 (US dollars) to register. This gives you the right to
complain and make suggestions. See REGISTER.TXT. My address is also
given at the end of this document.

--------------------------------------------------------------------
How CRYPTOSOLVE works (and why it sometimes doesn't)

CRYPTOSOLVE's hint system is based on patterns and rules.
It uses the patterns and rules to identify individual words (or author
names). Each pattern or rule maps to a particular word, so CRYPTOSOLVE
cannot directly identify any word that doesn't already have a pattern
or rule in the database. To fill in "unknown" words, the letters must
be cross-referenced in "known" words. This means that a cryptogram
containing a large number of "unknown" words will be difficult for
CRYPTOSOLVE to solve. Someone suggested that I type in part of
Jabberwocky (a nonsense poem: "TWAS BRILLIG AND THE SLITHY TOVES....")
and of course the hint system was pretty much a complete failure.

When you start a new puzzle and don't know any of the solution yet,
there are often many candidates for each word in the puzzle.
CRYPTOSOLVE ranks the candidates based on historical frequency of
occurrence. Suppose the puzzle contained the word "XABZZQ". Two words
that would fit, all other things being equal, are "JOHNNY" and
"REALLY". CRYPTOSOLVE knows that REALLY is about 6 times as likely to
occur as JOHNNY, so the autosolver is more likely to come up with X=R
than with X=J. If X=R is wrong and X=J is right, then the autosolver
will fail.

I have been training the expert system for about 6 years (1 cryptogram
per day) and there are over 8000 rules in the system. The cryptograms
I have been using are quotations, so the rules reflect word usage in
ordinary speech and writing. Cryptograms that are purposely designed
to be difficult by use of uncommon words are going to give the
autosolver problems.

Given all that, here is how you use the CRYPTOSOLVE hint system:

--------------------------------------------------------------------
SOLVING A PUZZLE USING THE HINTS SYSTEM

If you are given a starting letter, click the letter tile corresponding
to the puzzle letter and then type the solution letter.

Bring up the HINTS page by clicking the HINTS button.

Search for PATTERNS by clicking the PATTERNS button.
PATTERNS are words or author names that can be recognized because they
contain patterns of repeated letters. A common pattern would be WXYWZX,
which usually is the word PEOPLE. If any patterns are found, a list
will appear. You can single click a candidate to see how it looks in
the puzzle. If a candidate looks good to you, either double click the
candidate or hit the ACCEPT button.

If you find acceptable patterns, then try the RULES button to see if
there are any exact matches in the RULES database. RULES will not work
if there are no known letters; in fact, you need enough known letters
to match a rule in the rule database, so you usually need more than
one. A rule tells CRYPTOSOLVE how to infer unknown letters in a word
based on known letters. For example, if you have the puzzle word "XYZX"
and you know that Y=H and Z=A, then the rule "_HA_=THAT" tells you that
X=T. Even when you do know some letters, there may still be no matches
in the rule database. If matches are found, a list will appear showing
candidates which you can try out and accept by single clicking and
double clicking, respectively.

The next step is to try for a statistical solution using the AUTOSOLVE
method. This works by examining multiple puzzle words. For each puzzle
word examined, all possible solution words from the rule database are
found. Some candidates are eliminated by looking at what would happen
in cross-referenced words (for example, if the solution would create a
cross-referenced word that didn't have any vowels in it). Remaining
candidates are assigned weights based on historical frequency of
occurrence and on how closely the rule that produced the candidate was
met (Buzz word: fuzzy logic). Each candidate is then broken down into
letters, and statistics are kept on the total weight counts for all
possible puzzle letter/solution letter pairs.
When all of the words have been factored in, letters that get 80% or
more of the total weight for a given puzzle letter are incorporated
into the solution.

AUTOSOLVE can then perform another iteration, taking into account any
new letters in the solution. You keep running iterations until no more
progress is being made. Rather than proceed automatically to a new
iteration, CRYPTOSOLVE requires you to hit the GO button. This gives
you a chance to SET or EXCLUDE letters before proceeding. The SET box
allows you to enter a puzzle letter/solution letter pair that you have
figured out. The EXCLUDE box is used to inform CRYPTOSOLVE that a
puzzle letter/solution letter pair is not possible. This is useful when
the autosolver assigns a letter based on a high probability that is
nevertheless incorrect. You must exclude the puzzle letter/solution
letter pairing to override the probability and allow assignment of a
different letter.

The PATTERNS-->RULES-->AUTOSOLVE method doesn't usually require much
input from you, except to click the buttons. I was curious to see how
well the program would do on its own, so I bought a puzzle book and
tried out 40 cryptograms. Using just the 3 steps, CRYPTOSOLVE was able
to solve 25 out of the 40. I considered a puzzle solved if I could tell
what the solution was, even if one or two letters weren't filled in.
(This happens where there are 2 or more candidates for a letter and
none has enough weight to be included in the solution; for example, you
can't decide if _E is WE, ME, BE, etc. if there are no cross-referenced
words to factor into the decision.) It actually did better on the first
30 puzzles, solving 23 of them, but the last 10 were more difficult
puzzles that didn't use common words. On its usual diet of quotations
which are not purposely designed to be difficult, I am sure CRYPTOSOLVE
does at least as well as the 23/30 figure.
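
As a rough illustration of the weighting and the 80% cutoff described
above, here is a minimal Python sketch of one AUTOSOLVE iteration. It
is not CRYPTOSOLVE's actual code (that was written in Visual Basic).
The names fits, autosolve_step and CANDIDATES, and the weights 6 and 1,
are assumptions made up for the example. It reuses the XABZZQ puzzle
word from earlier, and it leaves out the vowel check on
cross-referenced words and the fuzzy rule matching.

```python
from collections import defaultdict

# Hypothetical candidate words with made-up weights (REALLY counted 6
# times as often as JOHNNY); not taken from CRYPTOSOLVE's data files.
CANDIDATES = [("REALLY", 6.0), ("JOHNNY", 1.0)]


def fits(cipher, plain, known, excluded):
    """Return True if plain is a consistent substitution for cipher."""
    if len(cipher) != len(plain):
        return False
    fwd = dict(known)                        # puzzle letter -> solution letter
    rev = {p: c for c, p in known.items()}   # solution letter -> puzzle letter
    for c, p in zip(cipher, plain):
        if (c, p) in excluded:
            return False
        # Each puzzle letter must map to exactly one solution letter,
        # and no two puzzle letters may share a solution letter.
        if fwd.setdefault(c, p) != p or rev.setdefault(p, c) != c:
            return False
    return True


def autosolve_step(puzzle_words, candidates, known,
                   excluded=frozenset(), threshold=0.8):
    """Run one iteration; return known plus any newly assigned letters."""
    weights = defaultdict(lambda: defaultdict(float))
    for cipher in puzzle_words:
        for plain, weight in candidates:
            if fits(cipher, plain, known, excluded):
                # Count each letter pair once per candidate word.
                for c, p in dict(zip(cipher, plain)).items():
                    weights[c][p] += weight
    solution = dict(known)
    for c, votes in weights.items():
        if c in known:
            continue
        best, best_weight = max(votes.items(), key=lambda kv: kv[1])
        # Accept a letter only if it holds the threshold share of the weight.
        if best_weight >= threshold * sum(votes.values()):
            solution[c] = best
    return solution


# No known letters yet: REALLY holds 6/7 of the weight, so X=R is assigned.
print(autosolve_step(["XABZZQ"], CANDIDATES, {}))
# EXCLUDE the X/R pair: only JOHNNY still fits, so X=J is assigned.
print(autosolve_step(["XABZZQ"], CANDIDATES, {}, excluded={("X", "R")}))
```

In this sketch REALLY gets 6/7 (about 86%) of the weight for X, which
clears the 80% bar. Excluding the X/R pair removes REALLY from
consideration, which is the same effect the EXCLUDE box is meant to
have.
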
I was actually able to solve the remainder of the puzzles (except for
the 40th "CHALLENGER" puzzle, which turned out to be mostly unfamiliar
words), just not automatically. How? See below.

So what do you do if the AUTOSOLVE method doesn't work? This is where
you come in. The POSSIBILITIES method allows you to double click a
puzzle word and look at candidates, sorted by probability. Once again
you are presented with a list and you can single click a candidate to
see how it looks in the puzzle. If you like a candidate, double click
it (or hit the ACCEPT button). You can then go back to RULES or
AUTOSOLVE and see what you get.

The other way that you interact with the program is by entering new
rules and patterns. As you work puzzles you will come across new words
and author names that contain patterns that will be recognizable in
future puzzles. I usually want at least two sets of repeated letters
before I consider a pattern to be unique enough to be included in the
PATTERN database, but this is entirely up to you. Similarly, as you
look at partial solutions you will see opportunities for new inference
rules. The hint pages all have an ADD NEW RULES button that pops up a
form for entering RULES and PATTERNS. This is your chance to make the
program smarter, to be the expert behind the expert system, so USE IT!

Patterns are just words and author names. If you think that the name
RONALD REAGAN has enough pattern to be identified in the future,
RONALD REAGAN is what you enter. You can look at the pattern database;
it is stored as a standard text file called PATT.DAT.

Rules are a bit more complicated. There are two distinct formats. The
first type of rule is used when the known letters in a word allow you
to infer the unknown letters. The left-hand side of the rule shows the
letters that must be known and uses the underscore character "_" in
positions that are unknown.
The rule then has an equal sign "=" and a right-hand side which must be
just like the left-hand side, except that the "_" characters are
replaced by the correct letters. Examples of this type of rule:

  _HA_=THAT
  CO__ITT__=COMMITTEE
  V_C__M=VACUUM

The second format of rule is used when there is only one unknown letter
left in a word, but there are multiple letters that would fit. The
left-hand side of this type of rule gives the known letters and an "_"
character in the position where the unknown letter will go. The rule
then has a "*" character followed by all of the possible letters.
Examples of this type of rule:

  _F*IO
  GOA_*DLT

The first says that _F is either IF or OF; the second says that GOA_ is
GOAD, GOAL or GOAT.

If I ever write another version of CRYPTOSOLVE, I expect that I will
include some additional rule types. One definite possibility is a rule
type for word endings (*I_G=*ING would be an example).

You can look at the RULE database. It is stored in the text file
PATT2.DAT.

--------------------------------------------------------------------
WORD COUNTS

When you have completely solved a puzzle, you should click the WORD
COUNT button so that the program will update the frequencies of
occurrence. This will happen automatically if you try to exit the
program after solving a puzzle. You will see a list of all of the words
in the puzzle. If NEW WORD appears beside a word, this word is being
seen for the first time. It is important that you create RULES or
PATTERNS for any new words. AUTOSOLVE gets candidate words using the
rule database, so if there is no rule for a word, AUTOSOLVE will not
consider it.

The WORD COUNT feature was the last major feature added to the original
DOS program, so the word counts are still a bit low.

--------------------------------------------------------------------
If Wishes were Fishes......

I wish that the program were faster. The original DOS version, which
was written in compiled QuickBasic 4.0, is MUCH faster than the VISUAL
BASIC for Windows code.
It didn't have a very good user interface, but it did the job. Of
course the machines keep getting faster and I hear rumors about there
being REAL compiled code (as opposed to p-code) in a future version of
Visual Basic, so maybe time will solve all of my problems. Either that,
or I get a C compiler and start writing DLLs (or maybe try that new
POWERBASIC compiler).

--------------------------------------------------------------------
The ABOUT Screen

View the ABOUT screen and be amazed! I am rather proud of it.

--------------------------------------------------------------------
Any Problems? Contact:

  James A. Parsly
  jparsly@usit.net
  423-966-4131
  624 Summit Lake Court
  Knoxville TN 37922